Smart Paper Notes

PeCLR: Self-Supervised 3D Hand Pose Estimation from monocular RGB via Contrastive Learning

Adrian Spurr , Aneesh Dahiya , Xi Wang , Xucong Zhang , Otmar Hilliges

Category: Computer Vision

2021-06-10

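Encouraged by the success of contrastive learning on image classification tasks, we propose a new self-supervised method for the structured regression task of 3D hand pose estimation. Contrastive learning makes use of unlabeled data for representation learning via a loss formulation that encourages the learned feature representations to be invariant under any image transformation. For 3D hand pose estimation, invariance to appearance transformations such as color jitter is likewise desirable. However, the task requires equivariance under affine transformations such as rotation and translation. To address this issue, we propose an equivariant contrastive objective and demonstrate its effectiveness in the context of 3D hand pose estimation. We experimentally investigate the impact of invariant and equivariant contrastive objectives and show that learning equivariant features leads to better representations for the task of 3D hand pose estimation. Furthermore, we show that standard ResNets of sufficient depth, trained on additional unlabeled data, attain improvements of up to 14.5% on FreiHAND and thus achieve state-of-the-art performance without any task-specific, specialized architectures. Code and models are available at https://ait.ethz.ch/projects/2021/peclr/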

translated by Google Translate

An Experience-based Direct Generation approach to Automatic Image Cropping

Casper Christensen , Aneesh Vartakavi

Category: Computer Vision | Machine Learning

2022-12-30

Automatic image cropping is a challenging task with many practical downstream applications. The task is often divided into sub-problems: generating cropping candidates, finding the visually important regions, and determining aesthetics to select the most appealing candidate. Prior approaches model one or more of these sub-problems separately, and often combine them sequentially. We propose a novel convolutional neural network (CNN) based method to crop images directly, without explicitly modeling image aesthetics, evaluating multiple crop candidates, or detecting visually salient regions. Our model is trained on a large dataset of images cropped by experienced editors and can simultaneously predict bounding boxes for multiple fixed aspect ratios. We consider the aspect ratio of the cropped image to be a critical factor that influences aesthetics. Prior approaches for automatic image cropping did not enforce the aspect ratio of the outputs, likely due to a lack of datasets for this task. We therefore benchmark our method on public datasets for two related tasks: first, aesthetic image cropping without regard to aspect ratio, and second, thumbnail generation that requires fixed aspect ratio outputs, but where aesthetics are not crucial. We show that our strategy is competitive with or performs better than existing methods in both these tasks. Furthermore, our one-stage model is easier to train and significantly faster at inference than existing two-stage or end-to-end methods. We present a qualitative evaluation study, and find that our model is able to generalize to diverse images from unseen datasets and often retains compositional properties of the original images after cropping. Our results demonstrate that explicitly modeling image aesthetics or visual attention regions is not necessarily required to build a competitive image cropping algorithm.


Metaheuristic for Hub-Spoke Facility Location Problem: Application to Indian E-commerce Industry

Aakash Sachdeva , Bhupinder Singh , Rahul Prasad , Nakshatra Goel , Ronit Mondal , Jatin Munjal , Abhishek Bhatnagar , Manjeet Dahiya

Category: Machine Learning

2022-12-16

The Indian e-commerce industry has evolved over the last decade and is expected to grow over the next few years. The focus has now shifted to turnaround time (TAT) due to the emergence of many third-party logistics providers and higher customer expectations. The key consideration for delivery providers is to balance their overall operating costs while meeting the TAT promised to their customers. E-commerce delivery partners operate through a network of facilities whose strategic locations help to run the operations efficiently. In this work, we identify the locations of hubs throughout the country and their corresponding mapping to the distribution centers. The objective is to minimize the total network costs while adhering to TAT. We use a genetic algorithm and leverage business constraints to reduce the solution search space and hence the solution time. The results indicate an improvement of 9.73% in TAT compliance compared with the current scenario.


A Survey of Multi-Agent Human-Robot Interaction Systems

Abhinav Dahiya , Alexander M. Aroyo , Kerstin Dautenhahn , Stephen L. Smith

Category: Robotics

2022-12-10

This article presents a survey of literature in the area of Human-Robot Interaction (HRI), specifically on systems containing more than two agents (i.e., having multiple humans and/or multiple robots). We identify three core aspects of "multi-agent" HRI systems that are useful for understanding how these systems differ from dyadic systems and from one another. These are the team structure, the interaction style among agents, and the system's computational characteristics. Under these core aspects, we present five attributes of HRI systems, namely team size, team composition, interaction model, communication modalities, and robot control. These attributes are used to characterize and distinguish one system from another. We populate the resulting categories with examples from recent literature along with a brief discussion of their applications, and analyze how these attributes differ from the case of dyadic human-robot systems. We summarize key observations from the current literature, and identify challenges and promising areas for future research in this domain. In order to realize the vision of robots being part of society and interacting seamlessly with humans, there is a need to expand research on multi-human, multi-robot systems. Not only do these systems require coordination among several agents, they also involve multi-agent and indirect interactions which are absent from dyadic HRI systems. Adding multiple agents to HRI systems requires advanced interaction schemes, behavior understanding, and control methods to allow natural interactions among humans and robots. In addition, research on human behavioral understanding in mixed human-robot teams also requires more attention. This will help formulate and implement effective robot control policies in HRI systems with large numbers of heterogeneous robots and humans, a team composition reflecting many real-world scenarios.


Temporal Analysis on Topics Using Word2Vec

Angad Sandhu , Aneesh Edara , Faizan Wajid , Ashok Agrawala

Category: Natural Language Processing

2022-09-23

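The present study proposes a novel method of trend detection and visualization, more specifically, modeling the change in a topic over time. Where current models used for the identification and visualization of trends only convey the popularity of a single word based on stochastic counting of usage, the approach in the present study illustrates the popularity of a topic and the direction in which it is moving. The direction in this case is a distinct subtopic within the selected corpus. Such trends are generated by modeling the movement of a topic, using k-means clustering and cosine similarity to group the distances between clusters over time. In a convergent scenario, it can be inferred that the topics as a whole are meshing (tokens between topics are becoming interchangeable). In contrast, a divergent scenario implies that each topic's respective tokens are not found in the same context (the words are increasingly different from each other). The methodology was tested on a group of articles from various media houses present in the 20 Newsgroups dataset.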
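The convergence and divergence idea in this abstract can be illustrated with a minimal sketch. It assumes each topic is already summarized by one cluster centroid per time slice (for example, from k-means over word vectors); the function names, the toy centroids, and the rule of comparing the first and last similarity are illustrative assumptions, not the authors' procedure.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def topic_trend(centroids_a, centroids_b):
    """Label two topics as converging, diverging, or stable over time.

    centroids_a[t] and centroids_b[t] are hypothetical cluster centroids
    for the two topics in time slice t.
    """
    sims = [cosine_similarity(a, b) for a, b in zip(centroids_a, centroids_b)]
    change = sims[-1] - sims[0]  # assumed rule: compare last slice to first
    if change > 0:
        return "converging", sims
    if change < 0:
        return "diverging", sims
    return "stable", sims


# Toy 2-D centroids for three time slices (illustrative values only).
topic_a = [(1.0, 0.0), (1.0, 0.0), (1.0, 0.0)]
topic_b = [(0.0, 1.0), (1.0, 1.0), (1.0, 0.0)]
label, sims = topic_trend(topic_a, topic_b)
```

Here the second topic's centroid rotates toward the first, so the similarity rises across slices and the pair is labeled as converging.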


Scheduling Operator Assistance for Shared Autonomy in Multi-Robot Teams

Yifan Cai , Abhinav Dahiya , Nils Wilde , Stephen L. Smith

Category: Robotics

2022-09-07

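In this paper, we consider the problem of allocating human operator assistance in a system with multiple autonomous robots. Each robot is required to complete independent missions, each defined as a sequence of tasks. While executing a task, a robot can either operate autonomously or be teleoperated by the human operator to complete the task faster. We show that the problem of creating a teleoperation schedule that minimizes the makespan of the system is NP-hard. We formulate the problem as a mixed integer linear program, which can be used to optimally solve small to moderately sized problem instances. We also develop an anytime algorithm that exploits the problem structure to provide fast, high-quality solutions to the operator scheduling problem, even for larger problem instances. Our key insight is to identify blocking tasks in greedily created schedules and iteratively remove those blocks to improve the quality of the solution. Through numerical simulations, we demonstrate the benefits of the proposed algorithm as an efficient and scalable approach that outperforms other greedy methods.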


Friendliness Of Stack Overflow Towards Newbies

Aneesh Tickoo , Shweta Chauhan , Gagan Raj Gupta

Category: Machine Learning

2022-08-21

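In today's digital world, we have a number of online question-and-answer platforms, such as Stack Exchange, Quora, and GFG, that serve as a medium for people to communicate and help each other. In this paper, we analyze the effectiveness of Stack Overflow in helping newcomers to programming. Every user on this platform goes through a journey. For the first 12 months, we consider them to be a newbie. After 12 months, they fall into one of the following categories: Experienced, Lurkers, or Inquisitive. Each question has tags assigned to it, and we observe that questions with certain tags have a faster response time, indicating a more active community in that field than in others. The platform grew steadily until 2013, after which it started declining, but recently, during the 2020 pandemic, we can see rejuvenated activity on the platform.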


Are You Comfortable Now: Deep Learning the Temporal Variation in Thermal Comfort in Winters

Betty Lala , Srikant Manas Kala , Anmol Rastogi , Kunal Dahiya , Aya Hagishima

Category: Machine Learning | Artificial Intelligence

2022-08-20

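Indoor thermal comfort in smart buildings has a significant impact on the health and performance of occupants. Consequently, machine learning (ML) is increasingly used to solve challenges related to indoor thermal comfort. Temporal variability of thermal comfort perception is an important problem that regulates occupant well-being and energy consumption. However, in most ML-based thermal comfort studies, temporal aspects such as the time of day, circadian rhythm, and outdoor temperature are not considered. This work addresses these problems. It investigates the impact of circadian rhythm and outdoor temperature on the prediction accuracy and classification performance of ML models. The data was gathered through month-long field experiments carried out in 14 classrooms with 512 primary school students. Four thermal comfort metrics are considered as the outputs of deep neural network and support vector machine models for the dataset. The effect of temporal variability on schoolchildren's comfort is shown through a "time of day" analysis. Temporal variability in prediction accuracy is demonstrated (up to 80%). Furthermore, we show that outdoor temperature (varying over time) improves the prediction performance of thermal comfort models by up to 30%. The importance of spatio-temporal context is demonstrated by contrasting micro-level (location-specific) and macro-level (6 locations across a city) performance. The most important finding of this work is a definitive improvement in prediction accuracy with an increase in the time of day and sky illuminance, for multiple thermal comfort metrics.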


ATP: A holistic attention integrated approach to enhance ABSA

Ashish Kumar , Vasundhra Dahiya , Aditi Sharan

Category: Natural Language Processing

2022-08-04

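Aspect-based sentiment analysis (ABSA) deals with identifying the sentiment polarity of a review sentence towards a given aspect. Deep learning sequential models such as RNN, LSTM, and GRU are the current state-of-the-art methods for inferring sentiment polarity. These methods work well at capturing the contextual relationships between the words of a review sentence. However, they are weak at capturing long-term dependencies. The attention mechanism plays a significant role by focusing only on the most crucial parts of the sentence. In the case of ABSA, aspect position plays a vital role: words near the aspect contribute more when determining the sentiment towards that aspect. Therefore, we propose a method that captures position-based information using a dependency parse tree and uses it to aid the attention mechanism. Using this type of position information instead of a simple word-distance-based position enhances the deep learning model's performance. We performed experiments on the SemEval'14 dataset to demonstrate the effect of dependency-parse-relation-based attention for ABSA.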
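To make the contrast between word distance and dependency-tree distance concrete, the following is a minimal sketch. It assumes a hand-written dependency parse given as a list of head indices; the `1 / (1 + distance)` weighting and both function names are hypothetical choices for illustration and are not taken from the paper.

```python
from collections import deque


def tree_distances(heads, aspect_index):
    """Hop distance from every token to the aspect token in a dependency tree.

    heads[i] is the index of token i's head, or -1 for the root.
    """
    n = len(heads)
    neighbours = [[] for _ in range(n)]
    for child, head in enumerate(heads):
        if head >= 0:
            neighbours[child].append(head)
            neighbours[head].append(child)
    dist = [None] * n
    dist[aspect_index] = 0
    queue = deque([aspect_index])
    while queue:  # breadth-first search outward from the aspect token
        node = queue.popleft()
        for nxt in neighbours[node]:
            if dist[nxt] is None:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def position_weights(distances):
    """Assumed weighting: closer tokens get larger weights, summing to 1."""
    raw = [1.0 / (1 + d) for d in distances]
    total = sum(raw)
    return [r / total for r in raw]


# "The pizza was delicious", aspect = "pizza" (index 1).
# Hand-annotated heads: The -> pizza, pizza -> delicious,
# was -> delicious, delicious = root.
tokens = ["The", "pizza", "was", "delicious"]
heads = [1, 3, 3, -1]
distances = tree_distances(heads, aspect_index=1)
weights = position_weights(distances)
```

In this toy sentence, "delicious" is two words away from "pizza" in the surface order but only one hop away in the tree, so a tree-based position signal gives the opinion word more weight than the intervening "was".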


NGAME: Negative Mining-aware Mini-batching for Extreme Classification

Kunal Dahiya , Nilesh Gupta , Deepak Saini , Akshay Soni , Yajun Wang , Kushal Dave , Jian Jiao , Gururaj K , Prasenjit Dey , Amit Singh

Category: Machine Learning

2022-07-10

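Extreme classification (XC) seeks to tag data points with the most relevant subset of labels from an extremely large label set. Performing deep XC with dense, learnt representations for data points and labels has attracted much attention due to its superiority over earlier XC methods that used sparse, hand-crafted features. Negative mining techniques have emerged as a critical component of all deep XC methods, allowing them to scale to millions of labels. However, despite recent advances, training deep XC models with large encoder architectures such as transformers remains challenging. This paper identifies that the memory overheads of popular negative mining techniques often force mini-batch sizes to remain small and slow training down. In response, this paper introduces NGAME, a lightweight mini-batch creation technique that offers provably accurate in-batch negative samples. This allows training with larger mini-batches, offering significantly faster convergence and higher accuracies than existing negative sampling techniques. NGAME was found to be up to 16% more accurate than state-of-the-art methods on a wide array of benchmark datasets for extreme classification, as well as 3% more accurate at retrieving search engine queries in response to a user webpage visit in order to show personalized ads. In live A/B tests on a popular search engine, NGAME yielded gains of up to 23% in click-through rate.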
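The notion of in-batch negatives mentioned in this abstract can be sketched as follows. This shows only the generic idea of scoring each data point against every label in the same mini-batch and treating the non-matching labels as negatives; it does not implement NGAME's mini-batch creation step, and the function name and toy embeddings are assumptions for illustration.

```python
import math


def in_batch_negative_loss(point_vecs, label_vecs):
    """Mean softmax cross-entropy over a mini-batch.

    label_vecs[i] is the positive label embedding for point_vecs[i];
    every other label in the batch acts as a negative for that point.
    """
    losses = []
    for i, point in enumerate(point_vecs):
        # Dot-product score of this point against every label in the batch.
        scores = [sum(a * b for a, b in zip(point, label)) for label in label_vecs]
        top = max(scores)  # subtract the max for numerical stability
        log_z = top + math.log(sum(math.exp(s - top) for s in scores))
        losses.append(log_z - scores[i])
    return sum(losses) / len(losses)


# Toy embeddings (illustrative values only).
points = [(5.0, 0.0), (0.0, 5.0)]
aligned = in_batch_negative_loss(points, [(1.0, 0.0), (0.0, 1.0)])
swapped = in_batch_negative_loss(points, [(0.0, 1.0), (1.0, 0.0)])
```

The loss is small when each point scores highest on its own label (`aligned`) and large when the in-batch negatives outscore the positives (`swapped`), which is why the choice of which items share a batch matters for how informative the negatives are.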
